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Abstract 

The effect that climate change and variability will have on waterborne bacteria is a topic of increasing concern for coastal 
ecosystems, including the Chesapeake Bay. Surface water temperature trends in the Bay indicate a warming pattern of 
roughly 0.3-0.4°C per decade over the past 30 years. It is unclear what impact future warming will have on pathogens 
currently found in the Bay, including Vibrio spp. Using historical environmental data, combined with three different 
statistical models of Vibrio vulnificus probability, we explore the relationship between environmental change and predicted 
Vibrio vulnificus presence in the upper Chesapeake Bay. We find that the predicted response of V. vulnificus probability to 
high temperatures in the Bay differs systematically between models of differing structure. As existing publicly available 
datasets are inadequate to determine which model structure is most appropriate, the impact of climatic change on the 
probability of V. vulnificus presence in the Chesapeake Bay remains uncertain. This result points to the challenge of 
characterizing climate sensitivity of ecological systems in which data are sparse and only statistical models of ecological 
sensitivity exist. 
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Introduction 

Vibrio spp. bacteria are a threat in many coastal aquatic 
ecosystems around the world [1—4], In the Chesapeake Bay, the 
number of annual human Vibrio cases of infection has nearly 
doubled in the past decade [5,6] . Furthermore, Vibrio spp. is 
frequently detected in shellfish harvested for human consumption 
during the warm summer months [7], In general, this seasonality 
correlates with peak incidence of Vibrio disease caused by Vibrio 
spp. bacteria in many coastal regions [8-10]. The probability of 
finding various Vibrio spp. in the Bay varies spatially and 
seasonally, and researchers have modeled these probability 
patterns as a statistical function of surface water temperature 
and salinity [11-16]. These temperature and salinity-based Vibrio 
models have demonstrated skill for available datasets in the Bay 
and structurally similar statistical models have been applied to 
predictions of V. cholerae , V vulnificus, and V parahaemohticus in 
other regions [1,4,17,18]. The environmental range of V. vulnificus 
can vary by region, but in general the bacteria are found in waters 
with salinity between 5 and 25 (practical salinity units) and 
temperature above 15°C [12,19-21]. 

Recent studies show that surface water temperatures in the 
Chesapeake Bay have warmed by 0.3-0.4°C per decade over the 
past 30 years [22,23]. This trend has resulted in an expansion of 
the warm season period during which water temperatures are high 
enough to support V. vulnificus growth: the onset of spring time 


temperatures (>15°C) has advanced by nearly three weeks [22]. 
Salinity patterns are also sensitive to climate change, as changes in 
springtime flow of the Susquehanna River - the primary 
freshwater input to the Bay - can influence salinity throughout 
the Bay over the V. vulnificus growth season. The consensus of 
climate models is that there will be a rise in winter and spring 
precipitation in the northern portion of the watershed [24,25] 
implying an increase in January to May Susquehanna River steam 
flow. A study by Gibson and Najjar (2000; [26]) showed that an 
increase in the January-May Susquehanna stream flow could 
potentially decrease winter and springtime salinity values by 7% in 
the upper Chesapeake Bay. 

Even though there is considerable uncertainty in the magnitude 
of projected warming and freshening of the Chesapeake Bay [27], 
it is valuable to understand how a temperature and salinity 
sensitive pathogen like V. vulnificus might respond to observed and 
projected trends in these environmental parameters. Here we 
examine three statistical models of V. vulnificus probability of 
presence that demonstrate skill in predicting V. vulnificus probabil- 
ity of presence in Chesapeake Bay. All three models use water 
surface temperature and salinity as the only predictors, but they 
differ in their structure and/or in the data used for training and 
evaluation. One model is the generalized linear model (GLM) of 
Jacobs et al. (2010; [12]) trained on data collected in the 
Chesapeake Bay in 2007 and 2008, the second model is also a 
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GLM but trained on a 2011 and 2012 data set, while the third 
model is generalized additive model (GAM) also trained on the 
2011-2012 data. The latter two models are from Urquhart et al. 
(2014; [15]), where surface temperature and salinity were used to 
model both the probability of presence and concentration Vibrio 
spp. in the upper Chesapeake Bay. Here we focus just on 
probability models to enable comparison with Jacobs et al. (2010; 
[12]), who considered only a probability model. Furthermore, we 
can evaluate the effect that these differences in structure and 
training data have on modeled estimates of V. vulnificus probability 
under current climate conditions, which is relevant for pathogen 
risk assessment and early warning, and consider the implications of 
these differences for projected V. vulnificus risk under climate 
change. 

Methods 

The Chesapeake Bay Estuary, adjacent to the Maryland, 
Delaware, and Virginia coastline, covers an area of approximately 
1 1,500 knT and is characterized by a sharp north-to-south salinity 
gradient. Salinity ranges from 0-6 in the northern Bay to 18-30 
near the mouth of the Bay. Surface water temperatures follow a 
seasonal cycle, ranging from local wintertime temperatures of 
— 0.5°C to summertime temperatures of 31°C [28]. The 
Susquehanna River, the largest and northernmost tributary, 
accounts for roughly 45% of the yearly freshwater inflow into 
the Bay. This paper focuses on the upper portion of Chesapeake 
Bay (Fig. 1). The upper region of the Bay was selected to avoid 
model predictions outside of the original training data salinity 
range (salinity >14). 

The climatological analysis presented here used historical 
environmental data collected by the Chesapeake Bay Data 
Program [29]. Bi-monthly surface water temperature, salinity, 
and chlorophyll a data were obtained for 16 main stem and 
tributary monitoring stations (Fig. 1) collected from 1985 through 
2013. For salinity, the absolute difference between observed 
salinity and the V. vulnificus optimal salinity value of 1 1.5 [12] was 
calculated, and use of deviation from this was used as an 



Figure 1. Map of the study area, showing contours of average 
surface water salinity. Dark markers represent in situ monitoring 
stations used for each of the subregions in this study: upper (star), mid 
(circle), and lower (square). 
doi:1 0.1 371 /journal. pone.0098256.g001 


explanatory covariate. The 16 monitoring stations were selected 
based on their geographic location serving as a representation of 
the upper Chesapeake Bay. In situ data were used to delineate 
three different salinity zones: upper-upper Bay (hereafter: ’’upper 
region”), middle-upper Bay (hereafter: ”mid region”), and lower- 
upper Bay (hereafter: ’’lower region”). These stations cover the 
upper main-stem Bay as well as tributary locations, with six 
stations in the upper region, five stations in the mid region, and 
five stations in the lower region. Observational data were averaged 
at monthly intervals for each zone resulting in 337 data records for 
the upper region and 342 data records for both the mid and lower 
regions. 

These salinity and temperature data were applied to the three 
statistical V. vulnificus probability models available for Chesapeake 
Bay: 

1. NOAA_GLM: The generalized linear model (GLM) of Jacobs 
et al. (2010; [12]): [z(V.v) — fi 0 + / l 1 Temp+ ffifi SalnOpt | , where 
fio is the intercept, /?„ is the regression coefficient for the 
independent covariates, Temp is daily surface temperature, and 
| SalnOpt | is the absolute distance from optimal salinity of 1 1.5], 
which was trained using 235 V vulnificus samples collected 
during the months of July and October of 2007, and April, 
July, and October of 2008 and were analyzed by the NOAA 
Chesapeake Bay Office. 

2. JHU_GLM: The GLM of Urquhart et al., (2014; [15]) of the 

same structural form as the NOAA_GLM \z(V.v ) — + 

fijTemp + PfiSalnOpt\\ trained using 148 V. vulnificus , surface 
temperature, and surface salinity samples collected in the upper 
Chesapeake Bay during the months of July and September of 
2011 and March through June of 2012 (Table SI; [15]). 
Samples were collected by The Johns Hopkins University 
(JHU) in collaboration with the Maryland Department of the 
Environment and NASA and were processed at the University 
of Maryland College Park. 

3. JHU_GAM: A generalized additive model (GAM; [30]) trained 
and evaluated using the same data that were used for 
JHU_GLM: \z(V.\ •>) — p ( f\-s I (Temp)+ S 2 (Saln), where sfixf) is a 
parameter of the smoothing function, and Sain is the salinity 
value] . 

We included a GAM in addition to the two structurally identical 
GLMs because GAM models allow a more flexible regression 
modeling of the transformed response that combine the predictor 
variables in a nonparametric manner [31]. The NOAA_GLM and 
JHU_GLM models have the same structure but vary in the 
training data, whereas the JHU_GLM and JHU_GAM models 
use the same training data but vary in the structure of model. .All 
models were implemented in logistic form using a “logit” link 
function for an optimal prediction point and were trained using 
observational bacteria data transformed to binary presence/ 
absence. Probability of V. vulnificus presence was calculated using 
p—e z /{ 1+/). Diagnostics for each model were performed using 
Akaike’s Information Criterion (AIC) and accuracy (ACC) in an 
out-of-bag (OOB) cross validation [32]. ACC is defined as ACC 
= (TP+TN)/(P+N) where TP is true positive, TN is true negative, 
P is the number of presence instances, and N is the number of 
absence instances. 

To explore sensitivity of the V. vulnfiicus models to temperature 
and salinity, we used a range of surface water temperature (0— 
40°C) and surface salinity (0-13) values as independent model 
input. Here the range of model input extends past the range of the 
in situ temperature (8-3 1°C) observations, and is constrained to 
the range of the surface salinity (0-14) observations that were used 
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Figure 2. Contour plots of V. vulnificus probability with temperature and salinity for (a) NOAA GLM, (b) JHU GLM, and (c) JHU GAM. 

Black dots represent monthly average (April-July) of in situ conditions; black lines represents in situ trend line, and dashed line represents shift in 
present day temperature and salinity, (d) Plot of temperature regressed against V.vulnificus probability at 1 1.5 salinity for each empirical method. 
Green circles represent the range of temperature observations during bacterium sampling. 
doi:1 0.1 371/journal.pone.0098256.g002 


in the training of the two JHU models. Additionally, historical 
temperature and salinity data were tested as model input, enabling 
identification of V vulnificus climatology and seasonal trends. To 
further assess the geographic distribution of the predicted V. 
vulnificus probability for each method, geospatially-interpolated 
satellite-derived surface temperature and surface salinity [33,34] 
were used to map spatially complete estimates of probability 
throughout the upper Bay. Interpolated satellite estimates were 
developed using monthly, level-2 Moderate Resolution Imaging 
Spectroradiometer (MODIS) surface water temperature (MOD 
28) and ocean color (Rrs 412-678) products. 

All statistical computations were carried out in the R Statistical 
Environment version 2.14, using the ‘mgcv’ and ‘stats’ packages, 
on an Intel Xeon W3580 Processor, 3.33 GHz machine with 12 
GB RAM. Computation time for all statistical models was less 
than one minute. 

Results and Discussion 

For model evaluation, goodness of fit and predictive skill for the 
JHU models were determined using AIC and ACC indices. AIC 
results indicated that the JHU GAM (145.9) offered better model 
fit than the JHU GLM (160.4), but performance differences 
between models were small relative to measurement uncertainty. 
NOAA GLM model fit using the NOAA training dataset yielded 
an AIC of 164.3 [12]). A direct comparison of model fit of could 
not be calculated due to lack of access to NOAA GLM training 
data. We stress that the difference in training data between the 


NOAA and JHU models is the primary reason for differences 
between NOAA_GLM and JHU_GLM, as the models are 
structurally identical. To predict bacterial presence, selection of 
an optimal prediction point was required. Rather than setting a 
prediction point at 0.5 arbitrarily, the prediction point was based 
on three performance indices: true positive rate, true negative rate, 
and ACC, yielding an optimal threshold of 0.4 for V. vulnificus. To 
determine the prediction skill of each model, ACC was calculated 
using the JHU validation dataset (ACC: 0.47, for NOAA GLM, 
0.59 for JHU GLM, and 0.60 for JHU GAM). The AIC and ACC 
values indicated that the JHU models performed significantly 
better than a null model that only included seasonality as a 
predictor. 

Figure 2 shows the relationship between temperature, salinity, 
and the mean estimate of predicted V. vulnificus probability for each 
of the tested models, with likelihood levels plotted as contour 
curves. NOAA GLM (Fig. 2a) exhibits a sharp increase in V. 
vulnificus probability with increasing temperatures along the axis of 
optimal salinity (11.5). Similarly, JHU GLM (Fig. 2b) exhibits a 
steady increase in probability with higher temperatures, though 
the rate of change with temperature is less steep than NOAA 
GLM. In contrast to the GLMs, JHU GAM (Fig. 2c) shows a 
probability maximum dependent on temperature, indicating a 
temperature optimum V vulnificus growth above which probability 
gradually declines. Figure 2d offers an alternative view of 
predicted V. vulnificus probability with temperature, at optimal 
salinity, including temperature observations during in situ bacteria 
collection. Furthermore, the wide range of observed temperatures 
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Figure 3. Monthly climatology of temperature and V. vulnificus probability for each method in the upper (a), mid (b), and lower (c) 
regions of the Chesapeake Bay. Peak temperature observations by year versus V. vulnificus probability for each method in the upper (d), mid (e), 
and lower (f) regions of the Chesapeake Bay. Trend lines are included for each method's observations. 
doi:1 0.1 371 /journal. pone.0098256.g003 


confirms that the declining GAM probability above optimal 
temperature is a valid model response and not an issue of limited 
observations at high temperature. 

These differences in model response also have implications for 
retrospective or near real-time estimation of risk of V. vulnificus 
presence. Using a 27-year in situ record of temperature and 
salinity in the upper Chesapeake Bay, we estimated V. vulnificus 
monthly probability of presence according to each statistical 
model. Fig. 3 shows the climatology of surface water temperature 
and mean estimate model predictions in each region of the upper 


Bay for March through November. A southward increase in 
predicted probabilities for all statistical methods during summer 
months suggests that distance from optimal salinity plays a role in 
the spatial distribution of V vulnficus presence. Predicted 
probabilities are likely lower in the upper region due to decreased 
salinity and larger deviation from optimal salinity. Seasonal 
patterns in all regions indicate that NOAA_GLM and JHU_GLM 
predict highest probabilities during the warmest summertime 
months. JHU_GAM exhibits a bimodal seasonal pattern with 
peaks in early and late summer across all regions. These 
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Figure 4. Monthly climatology of Chlorophyll a and V. vulnificus probability for each method averaged over the entire upper 
Chesapeake Bay. 
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JHU_GAM results are consistent with the temperature depen- 
dency shown in Fig. 2c. 

The difference in model sensitivity to temperature has 
implications for characterizing interannual variability in risk. 
Fig. 3d-f show mean predicted V vulnificus probability for the 
upper, middle, and lower portions of the study area plotted against 
annual peak monthly SST for the available historical record. In all 
three subregions, NOAA_GLM predicts that peak probabilities 
were highest in warmer years, while JHU_GAM predicts the 
opposite and JHU_GLM falls in between. We emphasize that 
these are the mean probability estimates for each model, and that 
there may not be statistically significant differences between model 
predictions in any given year. Nevertheless, mean estimates are 
commonly used to communicate risk and to project trends, so the 
fact that two comparably high performing models - NOAA_GLM 
and JHU_GAM - yield opposite mean estimates of the 
relationship between warm summers and V. vulnificus probability 
is relevant. 

The differences in these model response surfaces also have clear 
implications for projections of V. vulnificus probability under 
climate change. As a simple demonstration, we consider the 
consensus prediction of warming and freshening of the Bay 
(dashed line in Figure 2 a-c). NOAA_GLM projects steady or 
increasing probabilities: freshening moves conditions away from 
the salinity optimum but this effect is offset by increases in 
predicted probability with rising water temperatures. The 
JFIU_GLM shows a similar pattern but with lower sensitivity to 
changing environmental conditions. In contrast, warming only 
increases predicted probability of V vulnificus presence in 
JHU_GAM for relatively cool temperatures, representative of 
spring or fall conditions. Peak summertime temperatures are 


already above the temperature optimum in this model, so further 
warming results in a predicted decline in peak summertime V. 
vulnificus probability. 

While we cannot presently determine which sensitivity pattern is 
correct — the JHU_GLM and NOAA_GLM increase with higher 
temperatures or the JHU_GAM decline under warmest condi- 
tions — the JHU_GAM behavior might indicate that present-day 
summertime water temperatures are already above the optimal 
temperature for V vulnificus growth in Chesapeake Bay. Alterna- 
tively, the result might be understood in the context of previous 
studies that have shown Vibrio dependence on zooplankton due to 
attachment and/or Vibrio’s chitinoclastic activity [35,36]. Unfor- 
tunately we do not have adequate co-located measurements of 
zooplankton and V. vulnificus to include zooplankton in a predictive 
model. However, we do find that the climatology of Chesapeake 
Bay Program in situ chlorophyll a concentrations, which generally 
correlate with zooplankton presence, exhibits a bimodal seasonal 
pattern with a slight lead over the JHU GAM predicted V vulnificus 
peaks (Fig. 4). 

To examine the geographic extent of each methods’ predicted 
V vulnificus probability, monthly interpolated satellite surface water 
temperature and surface salinity products were used to create 
spatially complete probability hind-casts for 2012 in the upper Bay 
(Fig. 5). Consistent with results shown in Fig. 3, these maps show 
highest predicted probability towards the south of the analysis 
region, where salinity values are closest to optimum. 
NOAA_GLM and JHU_GLM both show the most widespread 
zones of high probability in the warmest summer months, while 
JHU_GAM predicts higher probabilities at the beginning and end 
of the warm season. Interesting spatial structures are also apparent 
in these maps. For example, NOAA_GLM predicts slightly 
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Figure 5. Upper Chesapeake Bay monthly V. vulnificus proba- 
bility hind-casts for April through October 2012, for NOAA 
GLM, JHU GLM, and JHU GAM methods. 
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elevated V. vulnificus probabilities in the eastern waters of the 
Chesapeake Bay during warmer months, while JHU_GAM 
predicts high probability zones in the western Bay during months 
with lower overall probability (Fig. 5). These patterns likely reflect 
the Bay’s two-layer physical circulation scheme in which we see 
fresher surface waters along the western shore and saltier waters 


along the eastern shore of the Bay. The predictions of statistical V. 
vulnificus probability models compared in this study clearly differ in 
the implied relationships between the structure of this circulation 
and the location of high V. vulnificus risk areas. 

Conclusions 

In summary, this study presents a comparison of three statistical 
ecological habit models for estimating the probability of V. 
vulnificus presence in the upper Chesapeake Bay. We examined 
individual model sensitivity to climatic variability and change 
within the upper Bay by assessing model response to a range of 
temperature and salinity values. We find that the three models 
differ systematically in the predicted response of V. vulnificus 
probability to high temperatures in the upper Chesapeake Bay. 

These results indicate that more data are required to constrain 
estimates of climate sensitivity of V vulnificus in Chesapeake Bay: 
statistical models are limited by the paucity of publicly available 
data from V vulnificus collections and co-located measurements of 
ecologically relevant variables, and process-based models would 
require further research on the V vulnificus life cycle in the Bay. 
Our results also caution against predicting or projecting climate- 
based changes in V. vulnfiicus exposure risk on the basis of the mean 
predictions of existing statistical models, as skillful and statistically 
indistinguishable models differ systematically in predicted V. 
vulnificus sensitivity to rising surface water temperature, even 
within the range of environmental conditions under which the 
models were trained. 

The challenges facing V. vulnificus modeling in Chesapeake Bay 
are not unique. Indeed, predictive capabilities for climate 
sensitivity of many pathogens are limited to statistical models 
based on scarce data. Other recent studies [37-38] emphasize that 
the inadequacy of available data hamper climate change 
projections for a diversity of waterborne pathogen systems in 
many regions. In the case of V. vulnificus in Chesapeake Bay we 
have a specific example of closely related modeling efforts that 
suggest systematically different impacts of climate change due to 
differences in model structure — i.e., the difference between 
JHU_GLM and JHU_GAM — and training data — i.e., the differ- 
ence between JHU_GLM and NOAA_GLM. These kinds of 
structural comparisons of statistical models, however, are not 
always performed in studies of climate sensitivity in ecological 
systems. The results of this study suggest that such model 
comparisons can be quite important when evaluating uncertainty 
in climate-based predictions and projections. 

Supporting Information 

Table SI In situ training dataset used for JHU_GLM and 
JHU_GAM likelihood models. 
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